Compiler Support for Code Size Reduction Using a Queue-Based Processor
نویسندگان
چکیده
Queue computing delivers an attractive alternative for embedded systems. The main features of a queue-based processor are a dense instruction set, high-parallelism capabilities, and low hardware complexity. This paper presents the design of a code generation algorithm implemented in the queue compiler infrastructure to achieve high code density by using a queue-based instruction set processor. We present the efficiency of our code generation technique by comparing the code size and extracted parallelism for a set of embedded applications against a set of conventional embedded processors. The compiled code is, in average, 12.03% more compact than MIPS16 code, and 45.1% more compact than ARM/Thumb code. In addition, we show that the queue compiler, without optimizations, can deliver about 1.16 times more parallelism than fully optimized code for a register machine.
منابع مشابه
An Efficient Code Generation Algorithm for Code Size Reduction Using 1-Offset P-Code Queue Computation Model
Embedded systems very often demand small memory footprint code. A popular architectural modification to improve code density in RISC embedded processors is to use a dual instruction set. This approach reduces code size at the cost of performance degradation due to the greater number of reduced width instructions required to execute the same task. We propose a novel alternative for reducing code...
متن کاملEfficient Code Generation Algorithm for Natural Instruction Level Parallelism-aware Queue Architecture
Queue based instruction set architecture processor offers an attractive option in the design of embedded systems. In our previous work, we proposed a novel queue processor architecture as a starting point for hardware/software design space exploration for embedded applications. In this paper, we present a high performance 32-bit QueueCore architecture. This work presents novel code generation a...
متن کاملTowards Code Generation for the Synchronous Control Asynchronous Dataflow (SCAD) Architectures
Many recent processor architectures expose their datapaths so that the compiler can not only schedule instructions to increase instruction-level concurrency, but can even take care of moving values between the processing units of the processor to optimize their allocation at compile-time. In this paper, we introduce with the Synchronous Control Asynchronous Dataflow (SCAD) paradigm another expo...
متن کاملCompiler Directed Issue Queue Energy Reduction
The issue logic of a superscalar processor consumes a large amount of static and dynamic energy. Furthermore, its power density makes it a hot-spot requiring expensive cooling systems and additional packaging. This paper presents a novel approach to energy reduction that uses compiler analysis communicated to the hardware, allowing the processor to dynamically resize the issue queue, fitting it...
متن کاملCompiler Support for Work-Stealing Parallel Runtime Systems
Multiple programming models are emerging to address an increased need for dynamic task parallelism in multicore shared-memory multiprocessors. Examples include OpenMP 3.0, Java Concurrency Utilities, Microsoft Task Parallel Library, Intel Threading Building Blocks, Cilk, X10, Chapel, and Fortress. Scheduling algorithms based on work-stealing, as embodied in Cilk’s implementation of dynamic spaw...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Trans. HiPEAC
دوره 2 شماره
صفحات -
تاریخ انتشار 2009